Language models are widely deployed to provide automatic text completion services in user products. However, recent research has revealed that language models (especially large ones) bear considerable risk of memorizing private training data, which is then vulnerable to leakage and extraction by adversaries. In this study, we test the efficacy of a range of privacy-preserving techniques to mitigate unintended memorization of sensitive user text, while varying other factors such as model size and adversarial conditions. We test both "heuristic" mitigations (those without formal privacy guarantees) and Differentially Private training, which provides provable levels of privacy at the cost of some model performance. Our experiments show that, with the exception of L2 regularization, heuristic mitigations are largely ineffective in preventing memorization in our test suite, possibly because they make overly strong assumptions about the characteristics that define "sensitive" or "private" text. In contrast, Differential Privacy reliably prevents memorization in our experiments, despite its computational and model-performance costs.
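The core mechanism of Differentially Private training is per-example gradient clipping followed by calibrated Gaussian noise. The abstract does not specify an implementation, so the following is a minimal NumPy sketch of one DP-SGD-style step; the function name `dp_sgd_step` and its default constants are illustrative, not taken from the paper.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One differentially private gradient step: clip each example's
    gradient to clip_norm, then add Gaussian noise scaled to the clip
    before averaging. Bounding each example's influence is what yields
    the formal privacy guarantee."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    summed = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]), np.array([0.1, 0.2])]  # toy per-example gradients
update = dp_sgd_step(grads)
```

The clipping bound and noise multiplier jointly determine the privacy budget; in practice a privacy accountant tracks the cumulative loss over training steps.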
We propose a deep learning method for three-dimensional reconstruction in low-dose helical cone-beam computed tomography. We reconstruct the volume directly, i.e., not from 2D slices, guaranteeing consistency along all axes. In a crucial step beyond prior work, we train our model in a self-supervised manner in the projection domain using noisy 2D projection data, without relying on 3D reference data or the output of a reference reconstruction method. This means the fidelity of our results is not limited by the quality and availability of such data. We evaluate our method on real helical cone-beam projections and simulated phantoms. Our reconstructions are sharper and less noisy than those of previous methods, and several decibels better in quantitative PSNR measurements. When applied to full-dose data, our method produces high-quality results orders of magnitude faster than iterative techniques.
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis. Starting from random noise, such text-to-image diffusion models gradually synthesize images in an iterative fashion while conditioning on text prompts. We find that their synthesis behavior qualitatively changes throughout this process: Early in sampling, generation strongly relies on the text prompt to generate text-aligned content, while later, the text conditioning is almost entirely ignored. This suggests that sharing model parameters throughout the entire generation process may not be ideal. Therefore, in contrast to existing works, we propose to train an ensemble of text-to-image diffusion models specialized for different synthesis stages. To maintain training efficiency, we initially train a single model, which is then split into specialized models that are trained for the specific stages of the iterative generation process. Our ensemble of diffusion models, called eDiff-I, results in improved text alignment while maintaining the same inference computation cost and preserving high visual quality, outperforming previous large-scale text-to-image diffusion models on the standard benchmark. In addition, we train our model to exploit a variety of embeddings for conditioning, including the T5 text, CLIP text, and CLIP image embeddings. We show that these different embeddings lead to different behaviors. Notably, the CLIP image embedding allows an intuitive way of transferring the style of a reference image to the target text-to-image output. Lastly, we show a technique that enables eDiff-I's "paint-with-words" capability. A user can select words in the input text and paint them on a canvas to control the output, which is very handy for crafting the desired image. The project page is available at https://deepimagination.cc/eDiff-I/
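The ensemble idea reduces to routing each denoising step to the expert trained for that stage of the noise schedule. A minimal sketch of that routing, assuming a single split point; the expert functions and `split_sigma` are hypothetical stand-ins, not the paper's actual models or schedule.

```python
import numpy as np

# Hypothetical stage-specialized denoisers: the early-stage expert would
# handle high noise levels (where text conditioning matters most), the
# late-stage expert low noise levels (visual refinement).
def expert_early(x, sigma):   # stand-in for the text-reliant expert
    return x * 0.5

def expert_late(x, sigma):    # stand-in for the refinement expert
    return x * 0.9

def sample(x, sigmas, split_sigma=1.0):
    """Toy ensemble sampler: each step is dispatched to the expert
    responsible for the current noise level -- the core eDiff-I idea.
    Inference cost is unchanged, since exactly one expert runs per step."""
    for sigma in sigmas:
        expert = expert_early if sigma >= split_sigma else expert_late
        x = expert(x, sigma)
    return x

x0 = sample(np.ones(4), sigmas=[2.0, 1.5, 0.8, 0.3])
```

Because only one specialist is evaluated at each step, the ensemble's sampling cost matches that of a single model, as the abstract states.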
Time-lapse image sequences offer visually compelling insights into dynamic processes that are too slow to observe in real time. However, playing back a long sequence often results in distracting flicker due to random effects (such as weather) and cyclic effects (such as the day-night cycle). We introduce the problem of disentangling time-lapse sequences in a way that allows separate, after-the-fact control of overall trends, cyclic effects, and random effects in the images, and describe a technique based on data-driven generative models that achieves this goal. This enables us to "re-render" the sequences in ways that would be impossible with the input images alone. For example, we can stabilize a long sequence to focus on plant growth, under selectable, consistent weather. Our approach is based on generative adversarial networks (GANs) conditioned on the time coordinate of the time-lapse sequence. We design our architecture and training procedure so that the networks learn to model random variations, such as weather, using the GAN's latent space, and to disentangle overall trends and cyclic variations by feeding the conditioning time label to the model using Fourier features with specific frequencies. We show that our models are robust to defects in the training data, enabling us to amend some of the practical difficulties of capturing long time-lapse sequences, such as temporary occlusions, uneven frame spacing, and missing frames.
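The Fourier-feature conditioning mentioned above can be sketched directly: the scalar capture time is mapped to sin/cos pairs at chosen frequencies, so a cyclic effect with a known period (e.g. the day-night cycle) aligns with one frequency while a low frequency captures the overall trend. A minimal sketch; the function name and the example frequencies are illustrative, not the paper's.

```python
import numpy as np

def fourier_time_features(t, frequencies):
    """Encode a scalar time coordinate t (normalized to [0, 1]) as sin/cos
    pairs at the given frequencies. A frequency equal to the number of
    days spanned by the sequence makes the day-night cycle a single
    period in that feature, separating it from the slow trend."""
    feats = []
    for f in frequencies:
        feats.append(np.sin(2 * np.pi * f * t))
        feats.append(np.cos(2 * np.pi * f * t))
    return np.array(feats)

# One low frequency for the overall trend, one per-day frequency for a
# hypothetical 30-day sequence.
cond = fourier_time_features(0.25, frequencies=[1, 30])
```

The conditioning vector `cond` would then be fed to the generator alongside the latent code, which is left to model the remaining random variation.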
Automated identification of the structural substrates underlying cardiac abnormalities can potentially provide real-time guidance for interventional procedures. With knowledge of the cardiac tissue substrate, treatments for complex arrhythmias such as atrial fibrillation and ventricular tachycardia can be further optimized by detecting the arrhythmia substrate to target. Optical coherence tomography (OCT) is a real-time imaging modality that can help satisfy this need. Existing approaches to cardiac image analysis mainly rely on fully supervised learning techniques, which suffer from the drawback of a labor-intensive, pixel-wise annotation process. To lessen the need for pixel-wise labeling, we develop a two-stage deep-learning framework for cardiac adipose tissue segmentation using image-level annotations on OCT images of human cardiac substrates. In particular, we integrate class activation mapping with superpixel segmentation to address the sparse tissue-seed challenge posed in cardiac tissue segmentation. Our study bridges the gap between the demand for automatic tissue analysis and the lack of high-quality pixel-wise annotations. To the best of our knowledge, this is the first study to attempt cardiac tissue segmentation on OCT images via weakly supervised learning techniques. On an in-vitro human cardiac OCT dataset, we demonstrate that our weakly supervised approach using image-level annotations achieves performance comparable to fully supervised approaches trained on pixel-wise annotations.
A major challenge in embedding or visualizing clinical patient data is the heterogeneity of variable types, including continuous lab values, categorical diagnostic codes, and missing or incomplete data. In particular, in EHR data, some variables are {\em missing not at random (MNAR)}: they are deliberately not collected and are therefore a source of information. For example, lab tests may be deemed necessary for some patients on the basis of suspected diagnoses, but not for others. Here we present the MURAL forest -- an unsupervised random forest for representing data with disparate variable types (e.g., categorical, continuous, MNAR). MURAL forests consist of a set of decision trees in which node-splitting variables are chosen at random, such that the marginal entropy of all other variables is minimized by the split. This allows us to also split on MNAR variables and discrete variables in a way that is consistent with continuous variables. The ultimate goal is to learn a MURAL embedding of patients using the average tree distance between them. These distances can be fed to nonlinear dimensionality-reduction methods such as PHATE to obtain visualizable embeddings. While such techniques are ubiquitous for continuous-valued datasets (such as single-cell RNA sequencing), they have not been used extensively on mixed-variable data. We showcase the use of our method on one artificial and two clinical datasets, and show that with our approach we can visualize and classify the data more accurately than competing approaches. Finally, we show that MURAL can also be used to compare cohorts of patients via the recently proposed tree-sliced Wasserstein distances.
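The split criterion described above -- pick the threshold on a randomly chosen variable that minimizes the marginal entropy of all other variables in the resulting children -- can be sketched in toy form. This is a simplified illustration, not the paper's implementation: the histogram discretization, the fixed bin edges, and the candidate thresholds are all assumptions made for the example.

```python
import numpy as np

def entropy(values, edges):
    """Shannon entropy of a 1-D sample under a fixed discretization."""
    counts, _ = np.histogram(values, bins=edges)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def best_split(X, var, thresholds, edges):
    """Choose the threshold on column `var` that minimizes the summed
    marginal entropy of all *other* columns across the two children --
    the MURAL split criterion, in toy form. Because the criterion only
    looks at the other variables, it applies equally to continuous,
    discrete, and MNAR split variables."""
    others = [j for j in range(X.shape[1]) if j != var]

    def score(t):
        left, right = X[X[:, var] <= t], X[X[:, var] > t]
        if len(left) == 0 or len(right) == 0:
            return np.inf
        return sum(entropy(side[:, j], edges)
                   for side in (left, right) for j in others)

    return min(thresholds, key=score)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 1] = np.where(X[:, 0] > 0, 6.0, 1.0) + 0.1 * X[:, 1]  # col 1 depends on sign of col 0
t = best_split(X, var=0, thresholds=[-1.0, 0.0, 1.0],
               edges=np.linspace(-2.5, 7.5, 5))
```

In the toy data, splitting column 0 at zero separates the two modes of column 1, so that threshold yields the lowest marginal entropy in the children.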
We explore methods for incorporating the uncertainty of data-driven object detectors into object-tracking algorithms. Object-tracking methods rely on a measurement-error model, typically in the form of measurement noise, a false-positive rate, and a missed-detection rate. Often, these quantities may depend on the object or on the measurement location. However, for detections generated from camera inputs processed by a neural network, these measurement-error statistics are insufficient to represent the dominant source of error: the dissimilarity between the sensor input at run time and the training data on which the detector was trained. To this end, we investigate incorporating data uncertainty into object-tracking methods, for example to improve the ability to track objects, particularly those outside the training data. The proposed methods are validated on an object-tracking benchmark as well as in experiments with a real autonomous aircraft.
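One common channel through which detector uncertainty can enter a tracker is the measurement-noise term of a Kalman-style update: an uncertain (e.g. out-of-distribution) detection is given a larger measurement variance and so moves the track estimate less. The abstract does not prescribe this mechanism, so the scalar sketch below is illustrative only.

```python
def kalman_update(x, P, z, R):
    """Scalar Kalman measurement update. The measurement variance R is
    where detector uncertainty can be injected: larger R means a less
    trusted detection and a smaller correction to the track state x."""
    K = P / (P + R)                     # Kalman gain
    return x + K * (z - x), (1 - K) * P

x, P = 0.0, 1.0                         # track state and variance
# The same detection z = 2.0, reported with high vs. low detector confidence.
x_conf, _ = kalman_update(x, P, z=2.0, R=0.1)   # confident, in-distribution
x_unc, _ = kalman_update(x, P, z=2.0, R=10.0)   # uncertain, OOD-like
```

The qualitative effect matches the motivation above: when the run-time input looks unlike the training data, inflating `R` keeps the track from being pulled hard toward an unreliable detection.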
Training generative adversarial networks (GAN) using too little data typically leads to discriminator overfitting, causing training to diverge. We propose an adaptive discriminator augmentation mechanism that significantly stabilizes training in limited data regimes. The approach does not require changes to loss functions or network architectures, and is applicable both when training from scratch and when fine-tuning an existing GAN on another dataset. We demonstrate, on several datasets, that good results are now possible using only a few thousand training images, often matching StyleGAN2 results with an order of magnitude fewer images. We expect this to open up new application domains for GANs. We also find that the widely used CIFAR-10 is, in fact, a limited data benchmark, and improve the record FID from 5.59 to 2.42.
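The "adaptive" part of the mechanism is a simple feedback controller: an overfitting heuristic is measured during training and the augmentation probability p is nudged toward a target value. A minimal sketch assuming the sign-based heuristic r_t = E[sign(D(real))]; the target and step constants here are illustrative.

```python
def update_augment_p(p, d_real_signs, target=0.6, step=0.01):
    """ADA-style controller sketch: r_t, the mean sign of the
    discriminator's outputs on real images, indicates overfitting when
    it is high. Raise the augmentation probability p when r_t exceeds
    the target, lower it otherwise, and clamp p to [0, 1]."""
    r_t = sum(d_real_signs) / len(d_real_signs)
    p += step if r_t > target else -step
    return min(max(p, 0.0), 1.0)

# Discriminator is confidently positive on all real images -> overfitting,
# so the controller increases p.
p = update_augment_p(0.0, d_real_signs=[1, 1, 1, 1, 1])
```

Because p adapts to the measured degree of overfitting, the same mechanism works both from scratch and when fine-tuning, without touching losses or architectures.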
The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
The ability to automatically estimate the quality and coverage of the samples produced by a generative model is a vital requirement for driving algorithm research. We present an evaluation metric that can separately and reliably measure both of these aspects in image generation tasks by forming explicit, non-parametric representations of the manifolds of real and generated data. We demonstrate the effectiveness of our metric in StyleGAN and BigGAN by providing several illustrative examples where existing metrics yield uninformative or contradictory results. Furthermore, we analyze multiple design variants of StyleGAN to better understand the relationships between the model architecture, training methods, and the properties of the resulting sample distribution. In the process, we identify new variants that improve the state-of-the-art. We also perform the first principled analysis of truncation methods and identify an improved method. Finally, we extend our metric to estimate the perceptual quality of individual samples, and use this to study latent space interpolations.
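The non-parametric manifold representation can be made concrete: approximate the real-data manifold as the union of balls around real feature vectors, each with radius equal to the distance to its k-th nearest neighbor, and count how many generated samples fall inside. A minimal NumPy sketch of the precision side (recall swaps the roles of the two sets); feature extraction is omitted and the toy points stand in for embedded samples.

```python
import numpy as np

def knn_radii(feats, k=3):
    """Radius of each point's k-th nearest neighbor (column 0 of the
    sorted distance matrix is the point itself)."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]

def precision(real, fake, k=3):
    """Fraction of generated samples that land inside the estimated real
    manifold: the union of k-NN balls around the real samples."""
    radii = knn_radii(real, k)
    d = np.linalg.norm(fake[:, None, :] - real[None, :, :], axis=-1)
    return np.mean(np.any(d <= radii[None, :], axis=1))

real = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
fake = np.array([[0.5, 0.5], [5.0, 5.0]])   # one on-manifold, one outlier
prec = precision(real, fake, k=2)
```

Separating the two directions -- quality (precision) and coverage (recall) -- is what lets the metric flag, e.g., a model that produces sharp but mode-collapsed samples, where a single-number metric would be ambiguous.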